Account for KubeRay Redis cleanup resources#11260
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: nerdeveloper The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @mimowo, could you please take a look when you have a chance? |
10f92dc to
71f1c08
Compare
| } | ||
|
|
||
| func hasGCSFaultTolerance(rayClusterSpec *rayv1.RayClusterSpec) bool { | ||
| return rayClusterSpec.GcsFaultToleranceOptions != nil |
There was a problem hiding this comment.
Could you check with the KubeRay code is this is a sufficient check? I remember that there was also a knob at the kuberay deployment that would be the global enabler, which is disabled would make this check irrelevant.
There was a problem hiding this comment.
Checked against KubeRay v1.6.1. The cleanup path is gated by ENABLE_GCS_FT_REDIS_CLEANUP != "false" and utils.IsGCSFaultToleranceEnabled(&instance.Spec, instance.Annotations). I updated Kueue to mirror the observable part of that predicate by using rayutils.IsGCSFaultToleranceEnabled(spec, annotations), and added the Kueue-side feature gate as a bailout for clusters where the KubeRay operator env disables cleanup.
| podSets = append(podSets, workerPodSet) | ||
| } | ||
|
|
||
| if hasGCSFaultTolerance(rayClusterSpec) { |
There was a problem hiding this comment.
Since this is a delicate change let's introduce a feature gate as a bailout option, say KubeRayAccountForRedisCleanup which is Beta, and add a comment o GA in 0.21 in Kueue.
There was a problem hiding this comment.
Done in the latest push. I added KubeRayAccountForRedisCleanup as Beta, default-on, with the GA-in-0.21 comment and regenerated the feature-gate YAML. I also added coverage for disabling the gate, plus the annotation-based GCS FT path. Local verification: go test ./pkg/features ./pkg/controller/jobs/raycluster ./pkg/controller/jobs/rayservice ./pkg/controller/jobs/rayjob and make verify-ci-lint both pass.
mimowo
left a comment
There was a problem hiding this comment.
cc @yaroslava-serdiuk ptal
71f1c08 to
2d125f3
Compare
2d125f3 to
2dbbdd9
Compare
|
So here we represent a RedisCleanupJob as an additional PodSet in the Workload and so decreasing the max number of workers group in the RayCluster from 7 to 6. I'm not sure if this is an issue, but it might be. |
|
/assign |
|
|
||
| const ( | ||
| redisCleanupPodSetName = kueue.PodSetReference(rayv1.RedisCleanupNode) | ||
| maxRayClusterPodSets = 8 |
There was a problem hiding this comment.
It's a kueue limitation, not a raycluster, so let's put it in the constants file and name it MaxPodSetsSize or something similar.
| return count | ||
| } | ||
|
|
||
| func maxWorkerGroupsForSpec(rayClusterSpec *rayv1.RayClusterSpec, annotations map[string]string) int { |
There was a problem hiding this comment.
I don't think we need both ExpectedPodSetsCount and maxWorkerGroupsForSpec, since they are counting the same thing.
For example, instead of limiting workers in 284, we can just verify that ExpectedPodSetsCount is <= maxPodSetsSize (8).
| } | ||
| } | ||
|
|
||
| func TestBuildPodSetsWithGCSFaultToleranceAnnotation(t *testing.T) { |
There was a problem hiding this comment.
Let's merge those two cases and have one table-driven test that verifies combinations for feature gate enabled/disabled and GCSFaultTolerance present/not present .
| } | ||
| } | ||
|
|
||
| func TestValidateCreateRayClusterSpecWithRedisCleanupFeatureGateDisabled(t *testing.T) { |
There was a problem hiding this comment.
Instead of introducing a separate test, let's include this test as a case of the test above.
|
/test pull-kueue-priority-booster-test-integration-main |
What this PR does / why we need it
When KubeRay GCS fault tolerance is enabled, KubeRay creates a Redis cleanup Job during RayCluster deletion. Kueue previously built Workload PodSets only for the Ray head and workers, so the cleanup Job resources were not reserved.
This adds a synthetic redis-cleanup PodSet with the same 200m CPU and 256Mi memory requests used by KubeRay, and updates RayCluster/RayService PodSet count handling and validation.
Which issue(s) this PR fixes
Fixes #10946
Special notes for your reviewer
Verified with a local Kind repro that the Workload now includes head, workers, and redis-cleanup PodSets, with ClusterQueue reservation cpu=2200m and memory=1280Mi for the minimal repro.
Does this PR introduce a user-facing change?
Additional documentation
None